Speaker recognition models
Authors
Abstract
This paper evaluates continuous density hidden Markov models (CDHMMs), dynamic time warping (DTW) and distortion-based vector quantisation (VQ) for speaker recognition across incremental amounts of training data. In comparing VQ and CDHMMs for text-independent (TI) speaker recognition, it is shown that VQ performs better than an equivalent CDHMM with one training version, but is outperformed by the CDHMM when trained with ten training versions. In text-dependent (TD) experiments, a comparison of DTW, VQ and CDHMMs shows that DTW outperforms VQ and CDHMMs for sparse amounts of training data, but with more data the performance of the three models is indistinguishable. Further analysis shows the TD architecture to be superior to the TI architecture for speaker recognition, and TD digit performance shows the digits zero, one and nine to be good discriminators.
Similar resources
Speaker Independent Speech Recognition Using Hidden Markov Models for Persian Isolated Words
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy is continuously improving, there is still a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also applied in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...